This static report is accompanied by a live R Shiny dashboard allowing one to test different tickers, exclude specific outlier periods, and adjust bootstrap simulation parameters.
Update this link!!!! Launch Interactive DashboardHealthcare expenditure represents a substantial share of economic activity in many high-income countries. In the United States, national health spending is projected to reach 20.3% of Gross Domestic Product (GDP) by 2033, up from 17.6% in 2023 (Keehan et al. 2025). Accurately modelling healthcare utilisation and expenditure is essential for forecasting budgetary pressures, planning service capacity, and designing policies that can meet the needs of an ageing population in an efficient and sustainable way.
The Agency for Healthcare Research and Quality (AHRQ) publishes the Medical Expenditure Panel Survey (MEPS), which provides detailed information on healthcare utilisation, associated expenditures, insurance coverage, and socio-demographic characteristics, all standardised to represent a full calendar year for each respondent (Agency for Healthcare Research and Quality 2023). These data enable one to quantify how healthcare use and costs vary across population groups and to identify factors associated with higher or lower spending, which is valuable for targeting interventions and evaluating potential policy reforms.
From a modelling perspective, healthcare expenditure data pose several
distributional challenges: they are non-negative, highly right-skewed (a few
patients incur massive costs), and contain a large proportion of individuals
with no recorded healthcare useage. Understanding both whether individuals access
healthcare and, conditional on doing so, how much is spent is crucial for
characterising demand and the resulting financial burden. This report uses 2012
MEPS data to investigate two primary outcomes related to physician services: the
number of doctor consultations (dvisit), capturing
healthcare utilisation, and the annual expenditures on doctor visits
(dvexpend), capturing the associated financial cost.
The 2012 MEPS dataset contains 10,638 observations on US adults aged 18–65. Tables 2.1 & 2.2 show the categorical and quantitative variables for both the full sample and a subsample of individuals with positive expenditure.
The covariates include demographic characteristics
(age, gender,
ethnicity, region,
and education), socioeconomic status
(income),
health indicators (BMI, self-reported
health general and mental health,
hypertension, and hyperlipidemia).
The number of non-physician visits (ndvisit) was
retained as a proxy for the latent propensity to seek care.
| Variable | Category | ||
|---|---|---|---|
gender
|
Female | 0.53 | 0.62 |
| Male | 0.47 | 0.38 | |
ethnicity
|
White | 0.70 | 0.71 |
| Black | 0.21 | 0.20 | |
| Native American | 0.01 | 0.01 | |
| Others | 0.09 | 0.08 | |
education_cat_detailed
|
Less than High School | 0.22 | 0.18 |
| High School Graduate | 0.31 | 0.29 | |
| Some College | 0.24 | 0.25 | |
| College Graduate | 0.15 | 0.17 | |
| Post-Graduate | 0.09 | 0.11 | |
region
|
Northeast | 0.15 | 0.16 |
| Midwest | 0.19 | 0.21 | |
| South | 0.39 | 0.38 | |
| West | 0.26 | 0.24 | |
hypertension
|
No | 0.75 | 0.65 |
| Yes | 0.25 | 0.35 | |
hyperlipidemia
|
No | 0.77 | 0.68 |
| Yes | 0.23 | 0.32 |
| Variable | Description | ||
|---|---|---|---|
bmi
|
Body mass index | 28.03 (6.41) | 28.68 (6.79) |
age
|
Age (years) | 40.25 (13.66) | 43.32 (13.55) |
education
|
Education (no. of years) | 12.77 (2.90) | 13.13 (2.81) |
income
|
Income (USD) | 60,817 (51,451) | 65,688 (54,115) |
dvisit
|
Doctor visits | 2.13 (3.63) | 3.94 (4.14) |
ndvisit
|
Non-doctor visits | 0.94 (2.91) | 1.51 (3.62) |
dvexpend
|
Doctor expenditure | 481.07 (1,646.79) | 889.71 (2,156.92) |
ndvexpend
|
Non-doctor expenditure | 159.59 (790.31) | 266.51 (1,038.92) |
Doctor-visit expenditure (dvexpend) is semi-continuous,
characterised by a point mass at zero (45.9% of observations)
and a heavy right skew among positive values. To address this, a two-part
model was used:
For both parts, the choice of covariates was informed by a preliminary Lasso
regression using the full set of candidate predictors, combined with prior
domain knowledge. ndvisit was retained a priori in both
parts as a proxy for latent propensity to use healthcare services. The final
two-part model is shown in Equations (2.1) & (2.2).
\[\begin{align}
% Part 1: Probit
\text{Part 1: }& \nonumber\\
\mathbb{P}\big[\text{dvexpend}_i > 0\big] &= \Phi(\eta_{1i}) \nonumber \\
\eta_{1i} &= \gamma_0 + \gamma_1 \text{Age}_i + \gamma_2 \text{Gender}_i + \gamma_3 \text{BMI}_i \nonumber \\
&\quad + \gamma_4 \text{Ethnicity}_i + \gamma_5 \text{Region}_i + \gamma_6 \text{Education}_i \nonumber \\
&\quad + \gamma_7 \text{General}_i + \gamma_8 \text{Mental}_i \nonumber \\
&\quad + \gamma_9 \text{Hypertension}_i + \gamma_{10} \text{Hyperlipidemia}_i \nonumber \\
&\quad + \gamma_{11} \text{Income}_i + \gamma_{12} \text{ndvisit}_i \tag{2.1} \\[10pt]
% Part 2: Gamma
\text{Part 2: }& \nonumber\\
\mathbb{E}\big[\text{dvexpend}_i &\mid \text{dvexpend}_i > 0\big] = \mu_i \nonumber \\
\ln(\mu_i) &= \beta_0 + \beta_1 \text{Age}_i + \beta_2 \text{Gender}_i \nonumber \\
&\quad + \beta_3 \text{Region}_i + \beta_4 \text{Education}_i \nonumber \\
&\quad + \beta_5 \text{General}_i + \beta_6 \text{Mental}_i \nonumber \\
&\quad + \beta_7 \text{Hypertension}_i + \beta_8 \text{Hyperlipidemia}_i \nonumber \\
&\quad + \beta_9 \text{Income}_i + \beta_{10} \text{ndvisit}_i \tag{2.2}
\end{align}\]
Where \(\eta_{1i}\) is the linear predictor for the Probit model,
\(\gamma_j\) are the coefficients for the Probit, and \(\beta_j\) are the
coefficients for the Gamma GLM with a log link.
The count of doctor visits (dvisit) exhibited significant
overdispersion (Variance: 13.1 > Mean:
2.1), violating the assumptions of a standard
Poisson model. A Negative Binomial regression was fitted to
account for the excess variability and lasso screening and prior domain knowledge
were used to inform covariate choice (Equation (2.3)).
A Poisson regression model with the same covariates as the final specification was also fitted and a formal overdispersion test, comparing the Poisson residual deviance to a \(\chi^2\) distribution with the corresponding residual degrees of freedom, yielded a Pearson dispersion of 4.18 (p < 0.001). The Poisson model had a higher Akaike Information Criterion (AIC) than the Negative Binomial model (AIC\(_\text{Poisson}\) = 50386 versus AIC\(_\text{NegBin}\) = 37334), further supporting the justification for a Negative Binomial model.
\[\begin{align} % Count model: Negative Binomial \text{Count model: } \text{dvisit}_i &\sim \text{NegBin}(\mu_i, \kappa) \nonumber \\[4pt] \log(\mu_i) &= \beta_0 + \beta_1 \text{Age}_i + \beta_2 \text{Gender}_i + \beta_3 \text{BMI}_i \nonumber \\ &\quad + \beta_4 \text{General}_i + \beta_5 \text{Mental}_i \nonumber \\ &\quad + \beta_6 \text{Ethnicity}_i + \beta_7 \text{Region}_i \nonumber \\ &\quad + \beta_8 \text{Hypertension}_i + \beta_9 \text{Hyperlipidemia}_i \nonumber \\ &\quad + \beta_{10} \text{Income}_i + \beta_{11} \text{Ndvisit}_i \nonumber \\ &\quad + \beta_{12} \text{Education}_i \tag{2.3} \end{align}\]
To investigate the dependence structure between the frequency of visits and the
intensity of expenditure beyond observed covariates, a copula approach was
employed following the framework of Marra and Radice (2025b). Using Sklar’s Theorem,
the joint cumulative distribution function, \(H(y_1,y_2)\), of utilisation
(\(Y_1\)) and expenditure (\(Y_2\)) can be modelled by coupling their marginal
distributions (\(F_1\) and \(F_2\)) via a copula function \(C\):
\[
H(y_1,y_2) = C(F_1(y_1), F_2(y_2); \theta)
\]
where \(\theta\) is the association parameter. A Gaussian copula was fitted to
the residuals of the marginal models for individuals with positive expendature
using the GJRM package in R and Kendall’s \(\tau\), a measure of
rank correlation, was estimated (Marra and Radice 2025a).
Both the Gamma and Negative Binomial models were fitted with a log link, so their coefficients are interpreted multiplicatively. For a predictor \(x_k\) with coefficient \(\beta_k\), a \(\Delta\)-unit increase in \(x_k\) multiplies the mean (conditional mean cost in the Gamma model and expected visit count in the Negative Binomial model) by \[ \exp(\Delta \beta_k). \] and the percent change is given by \[ 100\left(\exp(\Delta \beta_k)-1\right)\%. \]
| Term | Estimate | Std. Error | p-value | Estimate | Std. Error | p-value |
|---|---|---|---|---|---|---|
| (Intercept) | -0.780 | 0.084 | <0.001 | 5.634 | 0.129 | <0.001 |
| age | 0.008 | 0.001 | <0.001 | 0.009 | 0.002 | <0.001 |
| bmi | 0.006 | 0.002 | 0.009 | |||
| income | 0.000 | 0.000 | <0.001 | 0.000 | 0.000 | 0.002 |
| ndvisit | 0.113 | 0.008 | <0.001 | 0.070 | 0.007 | <0.001 |
| General Health (ref: Excellent) | ||||||
| Poor | 0.761 | 0.111 | <0.001 | 1.090 | 0.160 | <0.001 |
| Fair | 0.538 | 0.063 | <0.001 | 0.598 | 0.112 | <0.001 |
| Good | 0.301 | 0.046 | <0.001 | 0.306 | 0.089 | <0.001 |
| VGood | 0.159 | 0.041 | <0.001 | 0.023 | 0.083 | 0.779 |
| Mental Health (ref: Excellent) | ||||||
| Poor | 0.156 | 0.139 | 0.259 | 0.262 | 0.206 | 0.203 |
| Fair | 0.113 | 0.071 | 0.114 | 0.147 | 0.119 | 0.214 |
| Good | -0.053 | 0.044 | 0.223 | 0.057 | 0.081 | 0.482 |
| VGood | -0.034 | 0.039 | 0.385 | 0.087 | 0.075 | 0.243 |
| Education (ref: Less than High School) | ||||||
| High School Graduate | 0.082 | 0.037 | 0.027 | 0.142 | 0.078 | 0.069 |
| Some College | 0.241 | 0.040 | <0.001 | 0.251 | 0.081 | 0.002 |
| College Graduate | 0.358 | 0.047 | <0.001 | 0.290 | 0.092 | 0.002 |
| Post-Graduate | 0.397 | 0.058 | <0.001 | 0.409 | 0.107 | <0.001 |
| Ethnicity (ref: White) | ||||||
| Black | -0.054 | 0.034 | 0.113 | |||
| Native American | 0.179 | 0.149 | 0.232 | |||
| Others | -0.146 | 0.048 | 0.002 | |||
| Gender (ref: Female) | ||||||
| Male | -0.525 | 0.027 | <0.001 | -0.195 | 0.054 | <0.001 |
| Hyperlipidemia | ||||||
| Yes | 0.445 | 0.037 | <0.001 | 0.138 | 0.063 | 0.029 |
| Hypertension | ||||||
| Yes | 0.360 | 0.036 | <0.001 | 0.097 | 0.063 | 0.128 |
| Region (ref: Northeast) | ||||||
| Midwest | 0.034 | 0.045 | 0.455 | -0.017 | 0.084 | 0.838 |
| South | -0.098 | 0.040 | 0.013 | -0.192 | 0.077 | 0.012 |
| West | -0.178 | 0.043 | <0.001 | 0.006 | 0.082 | 0.945 |
| Term | Estimate | Std. Error | p-value |
|---|---|---|---|
| (Intercept) | -0.689 | 0.092 | <0.001 |
| age | 0.010 | 0.001 | <0.001 |
| bmi | 0.008 | 0.002 | <0.001 |
| income | 0.000 | 0.000 | <0.001 |
| ndvisit | 0.101 | 0.004 | <0.001 |
| General Health (ref: Excellent) | |||
| Poor | 1.178 | 0.095 | <0.001 |
| Fair | 0.812 | 0.064 | <0.001 |
| Good | 0.467 | 0.050 | <0.001 |
| VGood | 0.234 | 0.046 | <0.001 |
| Mental Health (ref: Excellent) | |||
| Poor | 0.588 | 0.119 | <0.001 |
| Fair | 0.368 | 0.068 | <0.001 |
| Good | 0.059 | 0.046 | 0.202 |
| VGood | 0.061 | 0.042 | 0.147 |
| Education (ref: Less than High School) | |||
| High School Graduate | 0.095 | 0.041 | 0.022 |
| Some College | 0.286 | 0.044 | <0.001 |
| College Graduate | 0.387 | 0.051 | <0.001 |
| Post-Graduate | 0.453 | 0.060 | <0.001 |
| Ethnicity (ref: White) | |||
| Black | -0.101 | 0.037 | 0.007 |
| Native American | -0.070 | 0.158 | 0.656 |
| Others | -0.126 | 0.053 | 0.018 |
| Gender (ref: Female) | |||
| Male | -0.629 | 0.029 | <0.001 |
| Hyperlipidemia | |||
| Yes | 0.369 | 0.037 | <0.001 |
| Hypertension | |||
| Yes | 0.318 | 0.037 | <0.001 |
| Region (ref: Northeast) | |||
| Midwest | 0.031 | 0.047 | 0.516 |
| South | -0.133 | 0.043 | 0.002 |
| West | -0.193 | 0.046 | <0.001 |
The two-part model effectively captures the dual processes of access and cost. In the first stage (Probit), females, older adults, and those with chronic conditions (hypertension, hyperlipidemia) were significantly more likely to incur expenditure (Table 3.1). In the second stage (Gamma GLM), conditional on seeking care, significant disparities in expenditure emerged. A 10-year increase in age is associated with approximately a 9% increase in conditional costs. Interestingly, males incur approximately 17.7% lower costs than females (p < 0.001). This finding aligns with work by Bertakis et al. (2000) who suggests women have higher healthcare engagement and diagnostic usage during reproductive years and for preventive screenings.
Self-reported general and mental health showed a strong effect with individuals who report “Poor” health incurring 197% higher costs than those in “Excellent” health. These patterns are consistent with clinical expectations - individuals who are sicker and/or more health-engagedboth enter the system more often and incur higher costs once they do so.
The two-part model achieves an Root Mean Square Error of $1599 and a Mean Absolute Error of $553. The predictive performance was robust at the aggregate level, with the total predicted cost within 1.8% of the actual total, although the model underestimates extreme outliers (Figure 3.1).
Figure 3.1: Distribution of prediction errors (Actual Cost − Predicted Cost) by expenditure category. The boxplots display the median and interquartile range of errors for patients grouped by their actual healthcare expenditure and the red dashed line represents a perfect prediction. The large positive errors for the highest cost categories show that the model fails to account for the full magnitude of catastrophic health expenditures.
As shown in Table 3.2, utilisation patterns mirrored
expenditure. Black and “Other” ethnic groups showed significantly lower visit
counts compared to White patients (Incident Rate Ratios < 1), consistent with
structural barriers to access and utilisation documented in the US healthcare
system (Waidmann and Rajan 2000; Macias-Konstantopoulos et al. 2023). Non-physician visits
(ndvisit) were positively associated with doctor visits
(approximately 11% increase per
visit), suggesting complementarity rather than substitution between provider
types.
The copula analysis of positive spenders revealed a moderate-to-strong positive dependence between frequency and severity. This implies that patients who have more doctor’s appointments also tend to have higher-than-expected costs per visit, suggesting a compounding resource burden for patients with higher health care demands.
As shown in Figure 3.2, there is a clear positive association between the number of visits and total expenditure. While some mechanical correlation is expected (more visits naturally equal more cost), the estimated Gaussian copula parameter (Kendall’s \(\tau =\) 0.522) indicates a dependence that extends beyond simple accumulation. This finding is in agreement with recent work by Marra and Radice (2025b) who used the same MEPS data, and also found that ‘heavy’ users are distinct not only in their visit frequency but also in their resource consumption intensity. This supports the use of a joint frequency-severity modelling framework over independent models.
Figure 3.2: Joint distribution of healthcare utilisation and expenditure for patients with at least one doctor’s visit. The vertical banding reflects the discrete nature of visit counts and the y-axis shows total doctor-visit expenditure (dvexpend) on a logarithmic scale to accommodate skewness. The overlaying positive trend highlights that expenditure increases with visit frequency. The model-estimated Kendall’s $ au$ quantifies the residual dependence, confirming that frequency and severity are positively correlated even after adjusting for covariates.
This analysis highlights the complex drivers of healthcare demand. The two-part expenditure model confirms that the decision to seek care and the resulting cost intensity are driven by overlapping but distinct magnitudes of effect. The significant gender and health-status gaps reinforce the need for risk-adjustment models that explicitly account for biological and systemic usage differences (Bertakis et al. 2000).
The utilisation analysis identified significant ethnic disparities in visit frequency. Even after controlling for income (a potential insurance proxy) and self-assessed health status, minority groups accessed physicians less frequently. These findings align with a review by Macias-Konstantopoulos et al. (2023) identifying structural racism and implicit bias as fundamental drivers of health inequities, where systemic factors limit access and quality of care independent of clinical need. However, it should be noted that the review by Macias-Konstantopoulos et al. (2023) was carried out under a critical review framework designed to ‘advance conceptual innovation’ rather than exhaustively assess the literature, and its findings should be interpreted within this context.
A key limitation of this study is the reliance on observational cross-sectional data, which precludes causal inference. While ndvisit was used as a proxy for care-seeking propensity, it may introduce endogeneity if unobserved health shocks drive both variable sets simultaneously. Furthermore, while the Gamma and Negative Binomial models handled skewness and overdispersion well, the diagnostic plots (Appendix A) show some strain in capturing the most extreme right-tail outliers—a common challenge in medical econometrics.
The copula analysis adds a critical dimension: frequency and severity are not independent. The positive Kendall’s \(\tau\) suggests that high-frequency patients are distinct not just in volume, but in the complexity of resources consumed per visit. Future frequency-severity modelling should therefore avoid independence assumptions to prevent the underestimation of aggregate risk for high-cost patient cohorts.
A key strength of this analysis is the alignment between the data structure and the chosen models. The two-part framework explicitly separates the extensive margin (any use) from the intensive margin (conditional costs), while the Negative Binomial model accommodates overdispersed visit counts. Variable selection was guided by Lasso screening, which helps to identify the most predictive covariates and reduce overfitting, before refitting unpenalised GLMs for interpretation. The models were estimated on a large, nationally representative dataset, and predictive checks suggest good calibration at the population level, with total predicted expenditure closely matching the observed total.
However, several limitations should be noted. First, the models describe associations, not causal effects. The observational nature of MEPS means that unobserved factors—such as underlying disease severity, insurance plan details, or provider characteristics—may confound the relationships between covariates and utilisation or cost. For example, ndvisit is interpreted as a proxy for latent propensity to seek care, but it may also be affected by the same unobserved health shocks that drive doctor visits, introducing simultaneity. Second, the linear predictors assume additive effects on the link scale and do not capture potential non-linearities or interactions (e.g. age by comorbidity); future work could relax this using splines or flexible machine-learning models such as tree-based methods. Third, despite the use of Gamma and Negative Binomial families, the models still struggle to reproduce the extreme right tail of the cost distribution, as evidenced by the large positive errors among the highest-cost patients.
Finally, although residuals and diagnostic plots do not reveal gross mis-specification, the usual GLM assumptions apply at the level of the conditional distribution: independence of observations, correct specification of the mean–variance relationship, and inclusion of the most important predictors. Violations of these assumptions, together with potential omitted variable bias, limit the extent to which conclusions can be generalised or interpreted in a causal sense. Nevertheless, the models provide a coherent descriptive summary of how doctor-visit utilisation and expenditure vary with observable demographic, socioeconomic, and health characteristics.
The Gamma model with log link handles the skewness of positive expenditures. The residual plots below show no severe deviations, though some ‘strain’ is visible at the highest fitted values.
Figure A.1: Diagnostic plots for the Gamma GLM (Expenditure).
Figure A.2: Deviance residuals versus fitted values for the Gamma GLM of conditional doctor-visit expenditure. Each point represents a respondent and the dashed line indicates a residual value of zero.
The Negative Binomial model accounts for overdispersion in visit counts. The diagnostics indicate a reasonable fit for the count data structure.
Figure A.3: Diagnostic plots for the Negative Binomial (Utilisation).
Figure A.4: Deviance residuals versus fitted values for the Negative Binomial regression of doctor-visit counts.
Deviance residuals (utilisation) show no strong systematic pattern, with most observations clustered near zero for low fitted visit counts. Variability increases slightly with the fitted mean and a few large positive residuals appear among the highest predicted counts, indicating some under-prediction for the heaviest users but no major violation of the Negative Binomial mean–variance structure.
An accessible HTML version of this report is available via a public GitHub page:
This report was created in R Markdown. The source code is open source and is available via: